Lenience towards Teammates Helps in Cooperative Multiagent Learning

Authors

  • Liviu Panait
  • Keith Sullivan
  • Sean Luke
Abstract

Concurrent learning is a form of cooperative multiagent learning in which each agent has an independent learning process and little or no control over its teammates’ actions. In such learning algorithms, an agent’s perception of the joint search space depends on the reward received by both agents, which in turn depends on the actions currently chosen by the other agents. Because of their learning processes, the agents will tend to converge towards certain areas of the space. As a result, an agent’s perception of the search space may benefit from being computed over multiple rewards in the early stages of learning, while additional rewards have little impact towards the end. We thus suggest that agents should be lenient with their teammates: ignore many of the low rewards initially, and fewer of them as learning progresses. We demonstrate the benefit of lenience in a cooperative coevolution algorithm and in a new reinforcement learning algorithm.
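
To make this schedule concrete, below is a minimal illustrative sketch (in Python) of a lenient learner in a repeated two-player cooperative matrix game. This is not the paper's exact algorithm: the class name, the epsilon-greedy action selection, and all constants (the learning rate, the initial lenience, and its decay rate) are assumptions chosen for illustration. What it does show is the mechanism suggested above: a low reward for an action is usually ignored early on, and the probability of ignoring it shrinks as learning progresses.

    import random

    class LenientLearner:
        """Stateless lenient learner for a repeated cooperative matrix game."""

        def __init__(self, n_actions, alpha=0.1, lenience0=0.95, decay=0.999):
            self.utility = [0.0] * n_actions         # running reward estimate per action
            self.lenience = [lenience0] * n_actions  # probability of ignoring a low reward
            self.alpha = alpha                       # learning rate (illustrative value)
            self.decay = decay                       # per-update lenience decay (illustrative)

        def select_action(self, epsilon=0.1):
            # Epsilon-greedy selection over the current utility estimates.
            if random.random() < epsilon:
                return random.randrange(len(self.utility))
            return max(range(len(self.utility)), key=lambda i: self.utility[i])

        def update(self, action, reward):
            # Always move the estimate toward rewards at least as good as it;
            # ignore lower rewards with the (shrinking) lenience probability.
            if reward >= self.utility[action] or random.random() >= self.lenience[action]:
                self.utility[action] += self.alpha * (reward - self.utility[action])
            self.lenience[action] *= self.decay      # become less forgiving over time

    # Two lenient learners receiving the same joint reward (payoffs illustrative):
    payoff = [[10, -20], [-20, 5]]
    a, b = LenientLearner(2), LenientLearner(2)
    for _ in range(10000):
        i, j = a.select_action(), b.select_action()
        r = payoff[i][j]
        a.update(i, r)
        b.update(j, r)

Because both learners receive the same joint reward, early lenience keeps one agent's exploratory mistakes from dragging down its teammate's estimate of the jointly optimal action; as lenience decays, the estimates settle toward the rewards actually being achieved.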

Similar Articles

Autonomous Learning Agents: Layered Learning and Ad Hoc Teamwork

In order to achieve long-term autonomy in the real world, fully autonomous agents need to be able to learn, both to improve their behaviors in a complex, dynamically changing world, and to enable interaction with previously unfamiliar agents. This talk begins by presenting layered learning, a hierarchical machine learning paradigm that enables learning of complex behaviors by incrementally lear...

Grounded Semantic Networks for Learning Shared Communication Protocols

Cooperative multiagent learning poses the challenge of coordinating independent agents. A powerful method to achieve coordination is allowing agents to communicate. We present the Grounded Semantic Network, an approach for learning a task-dependent communication protocol grounded in the observation space and reward function of the task. We show that the grounded semantic network effectively lea...

Efficient Behavior Learning Based on State Value Estimation of Self and Others

Existing reinforcement learning methods suffer severely from the curse of dimensionality, especially when applied to dynamic multiagent environments. A typical example is RoboCup competition, where the other agents and their behaviors easily cause state and action space explosion. This paper presents a method of modular learning in a multiagent enviro...

Efficient Behavior Learning by Utilizing Estimated State Value of Self and Teammates

Applications of reinforcement learning to real robots in dynamic multiagent environments are limited by the huge exploration space and the enormously long learning time. A typical example is RoboCup competition, where the other agents and their behaviors easily cause state and action space explosion. This paper presents a method that utilizes state value functions of macro actions t...

Adapting Plans through Communication with Unknown Teammates: (Doctoral Consortium)

Coordinating a team of autonomous agents is a challenging problem. Agents must act in ways that make progress toward a goal while avoiding conflicts with their teammates. In information-asymmetric domains, it is often necessary to share crucial observations in order to collaborate effectively. In traditional multiagent systems literature, these teams of agents share an ...

Publication date: 2005